Kernels for deep learning - with and without tricks∗
Author
Abstract
Neural networks have recently re-emerged as a powerful hypothesis class, yielding impressive empirical performance in multiple domains. However, their training is a non-convex optimization problem, which poses exciting theoretical and practical challenges. Here we argue that by extending the class of neural nets, one can obtain a convex learning problem, whose practical solution relies on the evaluation of a particular kernel (i.e., the kernel “trick”). We show that in some cases this kernel can be calculated in closed form. We next turn to the case where the kernel cannot be evaluated in closed form, and introduce a sampling-based algorithm for learning with the same hypothesis class. Our regret-based analysis shows that the sample complexity of the sampling algorithm is similar to that of an algorithm that uses the exact kernel. Empirical evaluation shows that the method is competitive with other kernels and sampling-based algorithms.

∗Machine Learning External Seminar, Gatsby Unit, July 20, 2016.
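The abstract does not spell out which extended network class admits a closed-form kernel, but one well-known instance is the infinite-width one-hidden-layer ReLU network with standard Gaussian weights, whose expected feature product is the degree-1 arc-cosine kernel of Cho and Saul (2009). The sketch below is a minimal illustration of the two regimes the abstract contrasts, not the construction used in the talk: it evaluates that kernel exactly via its closed form and approximately by sampling random ReLU features.

```python
import numpy as np

def arccos_kernel(x, y):
    """Closed-form kernel of an infinite-width one-hidden-layer ReLU network
    with standard Gaussian weights: E_w[relu(w.x) * relu(w.y)], which equals
    ||x|| ||y|| (sin t + (pi - t) cos t) / (2 pi) for the angle t between x, y
    (the degree-1 arc-cosine kernel, Cho & Saul 2009)."""
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    theta = np.arccos(np.clip(x @ y / (nx * ny), -1.0, 1.0))
    return nx * ny * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)

def sampled_kernel(x, y, num_features=20000, seed=0):
    """Monte Carlo estimate of the same expectation from sampled random ReLU
    features -- the route available when no closed form is known."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((num_features, x.shape[0]))
    return np.maximum(W @ x, 0.0) @ np.maximum(W @ y, 0.0) / num_features

x = np.array([1.0, 0.5, -0.2])
y = np.array([0.3, -1.0, 0.8])
print(arccos_kernel(x, y))   # exact value via the kernel "trick"
print(sampled_kernel(x, y))  # sampled estimate; concentrates as num_features grows
```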
Similar papers
cuDNN: Efficient Primitives for Deep Learning
We present a library that provides optimized implementations for deep learning primitives. Deep learning workloads are computationally intensive, and optimizing the kernels of deep learning workloads is difficult and time-consuming. As parallel architectures evolve, kernels must be reoptimized for new processors, which makes maintaining codebases difficult over time. Similar issues have long be...
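In practice the library is usually reached through a deep-learning framework rather than through its C API directly. A minimal sketch, assuming PyTorch built with cuDNN support and a CUDA-capable GPU, that routes a convolution through a cuDNN kernel and lets cuDNN autotune the algorithm choice:

```python
import torch
import torch.nn as nn

# Let cuDNN benchmark and pick the fastest convolution algorithm for this shape.
torch.backends.cudnn.benchmark = True

device = "cuda" if torch.cuda.is_available() else "cpu"  # the cuDNN path needs a GPU
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, padding=1).to(device)
x = torch.randn(8, 3, 224, 224, device=device)
y = conv(x)  # on the CUDA path, the forward pass is executed by a cuDNN kernel
print(y.shape)  # torch.Size([8, 64, 224, 224])
```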
Adaptive Normalized Risk-Averting Training for Deep Neural Networks
This paper proposes a set of new error criteria and learning approaches, Adaptive Normalized Risk-Averting Training (ANRAT), to attack the non-convex optimization problem in training deep neural networks (DNNs). Theoretically, we demonstrate its effectiveness on global and local convexity lower-bounded by the standard Lp-norm error. By analyzing the gradient on the convexity index λ, we explain...
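The truncated snippet does not state the criterion itself; risk-averting errors of this family are typically a log-mean-exp over per-example Lp residuals with a convexity index λ, and ANRAT adapts λ during training. A hedged sketch under that assumption (the exact ANRAT criterion and its λ-update rule may differ), written in PyTorch:

```python
import math
import torch

def risk_averting_error(pred, target, lam, p=2):
    """Sketch of a normalized risk-averting style criterion:
        (1/lam) * log( mean( exp( lam * |pred - target|^p ) ) ).
    By Jensen's inequality this is never smaller than the plain mean Lp error
    and approaches it as lam -> 0, while larger lam penalizes big residuals
    more sharply (the convexity index)."""
    residual = (pred - target).abs().pow(p).flatten()
    return (torch.logsumexp(lam * residual, dim=0) - math.log(residual.numel())) / lam

# Treating lambda as a learnable tensor is one way to make it "adaptive":
lam = torch.tensor(2.0, requires_grad=True)
pred = torch.randn(32, requires_grad=True)   # stand-in for network outputs
target = torch.randn(32)
loss = risk_averting_error(pred, target, lam)
loss.backward()                              # gradients reach both pred and lam
```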
Prediction of Iranian EFL Learners’ Learning Approaches Through Their Teachers’ Narrative Intelligence and Teaching Styles: A Structural Equation Modelling Analysis
It goes without saying that there are many influential factors affecting the success of any learning experience, and teachers are definitely among the significant factors influencing the process of teaching and learning. In this respect, the present study sought to investigate the prediction of Iranian English as a Foreign Language (EFL) learners' learning approaches through their teachers’ nar...
Ensemble Kernel Learning Model for Prediction of Time Series Based on the Support Vector Regression and Meta Heuristic Search
In this paper, a method for predicting time series is presented. Time series prediction is the process of forecasting future system values from information in past and present data points. Time series prediction models are widely used in various fields of engineering, economics, etc. The main purpose of using different models for time series prediction is to make the forecast with...
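As a baseline illustration of the support-vector-regression piece only (the paper's kernel ensemble and metaheuristic hyperparameter search are not reproduced), the sketch below builds lagged features for one-step-ahead forecasting and tunes a single RBF-kernel SVR with an ordinary time-series cross-validated grid search; the synthetic series and parameter grid are placeholders.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

def make_lagged(series, n_lags=5):
    """Turn a 1-D series into (lag-window, next-value) pairs for one-step-ahead SVR."""
    X = np.column_stack([series[i:len(series) - n_lags + i] for i in range(n_lags)])
    return X, series[n_lags:]

rng = np.random.default_rng(0)
series = np.sin(0.1 * np.arange(500)) + 0.1 * rng.standard_normal(500)  # toy data

X, y = make_lagged(series)
split = int(0.8 * len(y))                    # keep the last 20% as a hold-out set
search = GridSearchCV(SVR(kernel="rbf"),
                      {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1.0]},
                      cv=TimeSeriesSplit(n_splits=3))
search.fit(X[:split], y[:split])
print("held-out R^2:", search.score(X[split:], y[split:]))
```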
Learning with Hierarchical Gaussian Kernels
We investigate iterated compositions of weighted sums of Gaussian kernels and provide an interpretation of the construction that shows some similarities with the architectures of deep neural networks. On the theoretical side, we show that these kernels are universal and that SVMs using these kernels are universally consistent. We further describe a parameter optimization method for the kernel p...
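The snippet does not give the construction explicitly; one plausible reading of "iterated compositions of weighted sums of Gaussian kernels" is sketched below, where each layer applies a weighted sum of Gaussians to the squared distance induced by the previous layer's Gram matrix. The parameterisation here is an illustrative assumption, not necessarily the paper's definition.

```python
import numpy as np

def hierarchical_gaussian_gram(X, layer_params):
    """Iterate: K_{l+1}(x, x') = sum_j w_j * exp(-d_l(x, x')^2 / (2 s_j^2)),
    where d_l is the distance induced by the previous Gram matrix,
    d_l(x, x')^2 = K_l(x, x) + K_l(x', x') - 2 K_l(x, x').
    `layer_params` is a list of (weights, sigmas) pairs, one per layer."""
    D2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # input distances
    K = None
    for weights, sigmas in layer_params:
        K = sum(w * np.exp(-D2 / (2.0 * s ** 2)) for w, s in zip(weights, sigmas))
        diag = np.diag(K)
        D2 = diag[:, None] + diag[None, :] - 2.0 * K  # distance induced by K
    return K

X = np.random.default_rng(0).standard_normal((10, 4))
params = [([0.5, 0.5], [1.0, 2.0]),  # layer 1: two Gaussians, equal weights
          ([1.0], [0.5])]            # layer 2: a single narrower Gaussian
K = hierarchical_gaussian_gram(X, params)
print(K.shape, np.linalg.eigvalsh(K).min() > -1e-9)  # (10, 10) True: valid PSD kernel
```

Each layer's output stays positive semi-definite because it is a (nonnegatively weighted sum of) Gaussian kernels evaluated on a Hilbert-space distance, so the final Gram matrix can be plugged into an SVM as usual.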